<HTML>

<HEAD>

<TITLE>Etch Overview</TITLE>

<BASE HREF="http://www.cs.washington.edu/homes/bershad/etch/index.html">
<META NAME="GENERATOR" CONTENT="Internet Assistant for Microsoft Word 2.0z">
</HEAD>
<BODY bgcolor="#ffffff">

<H1>Instrumentation and Optimization of WIN32/Intel Executables
</H1>

<P>
<B>Etch</B> is an application program performance evaluation and
optimization system, developed for Intel x86 platforms running
Windows/95, Windows/NT and Linux operating systems. The system
allows you to annotate existing binaries with arbitrary instructions
(for example, to trace execution or perform coverage analysis), or to rewrite
an existing binary so that it executes more efficiently. 
<P>
Etch works directly on x86 executables. It does <EM>not</EM> require
program source code for either measurement or optimization. 
<P>
<H2>Some Results</H2>

<P>
Traces that we have generated from a few popular Windows programs
on x86 are available
<A HREF="http://etch.eecs.harvard.edu/traces/index.html">here</A>.


<H2>Who uses Etch? </H2>

<P>
Etch is targeted at two different user groups: <EM>developers</EM>,
who wish to understand the performance of their programs during
the development cycle, and <EM>users</EM>, who wish to understand
and improve the performance of common applications executing in
their environment. 
<P>
Etch provides both groups with measurement tools to evaluate performance
at several levels of detail, and optimization tools to automatically
restructure programs to improve performance, where possible. 
<H2>How Etch Works </H2>

<P>
Etch reads executable binaries (and, under Win32, DLLs) for an
application, modifies the image, and writes a new one that has
been enhanced for measurement or optimization. The transformations
performed on the binary by Etch do not change program correctness,
although a program transformed for performance measurement collection
will run more slowly. Etch does not require changes to the operating
system, but a modified Etch binary may utilize OS facilities,
such as software timers, or even implementation-specific facilities,
such as Intel Pentium performance counters. 
<H2>Using Etch </H2>

<P>
There are three key concepts in using Etch: 
<DL>
<DT>Instrumentation
<DD>Instrumentation transforms a binary according to arbitrary
criteria. For example, a program may be instrumented to count
instructions, to count the occurrences of each instruction,
or to simulate a cache by tracking memory references. 
<DT>Data collection
<DD>Once instrumented, an executable can be run. At that time,
instrumentation routines collect data about the program. 
<DT>Data processing
<DD>Once run, any data generated by an instrumented executable
can be processed. Trace-based optimization is a typical data processing
phase made possible by Etch. 
</DL>
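
<P>
As an illustration of the cache simulation example mentioned under
Instrumentation, the sketch below feeds a list of memory addresses to
a direct-mapped cache model. It is not Etch code: the function name,
the cache geometry (32-byte lines, 256 lines) and the sample trace are
all assumptions chosen for the example.
<PRE>
```python
# Direct-mapped cache model fed by a list of memory addresses.
# The geometry below is an illustrative assumption, not an Etch default.

LINE_SIZE = 32    # bytes per cache line (assumed)
NUM_LINES = 256   # number of lines in the cache (assumed)


def simulate(addresses):
    """Return (hits, misses) for a sequence of byte addresses."""
    tags = [None] * NUM_LINES  # tag currently held by each cache line
    hits = 0
    misses = 0
    for address in addresses:
        block = address // LINE_SIZE   # memory block being touched
        index = block % NUM_LINES      # cache line the block maps to
        tag = block // NUM_LINES       # distinguishes blocks sharing a line
        if tags[index] == tag:
            hits += 1
        else:
            misses += 1
            tags[index] = tag          # replace whatever was there
    return hits, misses


# Addresses 0 and 4 share a line; 8192 maps to the same line and evicts it.
trace = [0, 4, 8192, 0]
print(simulate(trace))
```
</PRE>
<P>
In a real tool the addresses would come from the instrumented program
as it runs rather than from a fixed list.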

<H3>Instrumentation </H3>

<P>
To instrument a program, Etch is invoked with the name of an executable
and a DLL. The DLL provides a set of routines that are invoked
for each instruction in the executable. Roughly, Etch operates
as follows: 
<PRE>
<FONT SIZE=2>        for each instruction in executable
                InstrumentBefore(instruction);  
                InstrumentAfter(instruction);
        end;
        InstrumentBeforeMainProgram();
        InstrumentAfterMainProgram();
 </FONT>
</PRE>

<P>
The instrumentation tool provides implementations of these &quot;Before&quot;
and &quot;After&quot; functions. The callback functions can in
turn direct Etch to modify the executable with respect to the
specific instruction. The directions in effect say &quot;before
(or after) this instruction runs, please call some specific function
with some specific set of arguments.&quot; For example, to count
instructions, the InstrumentBefore procedure would direct Etch
to insert code that increments a counter at runtime. These inserted
instructions do not change the correctness of the program. 
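<P>
The instruction-counting example can be modeled in a few lines. The
sketch below is a toy, not Etch's real interface: the Rewriter class,
insert_call_before, bump_counter and the four-instruction program are
hypothetical names invented for illustration, and the program is
treated as straight-line code with no branches.
<PRE>
```python
# Toy model of callback-driven instrumentation. Every name here is
# hypothetical; this is not Etch's real API.

class Rewriter:
    """Stands in for Etch: records calls to insert before instructions."""

    def __init__(self, instructions):
        self.instructions = list(instructions)
        self.before = {}  # instruction index -> list of (function, args)

    def insert_call_before(self, index, function, *args):
        self.before.setdefault(index, []).append((function, args))

    def run(self):
        # Models running the instrumented executable: inserted calls fire
        # first, then the original instruction would execute unchanged.
        for index, _instruction in enumerate(self.instructions):
            for function, args in self.before.get(index, []):
                function(*args)


counts = {"instructions": 0}


def bump_counter():
    # Runtime routine that the inserted code calls.
    counts["instructions"] += 1


def instrument_before(rewriter, index, instruction):
    # Instrumentation-time callback: request a counter bump before
    # every instruction, whatever it is.
    rewriter.insert_call_before(index, bump_counter)


def instrument(rewriter):
    # The scan loop: visit each instruction once and let the tool decide.
    for index, instruction in enumerate(rewriter.instructions):
        instrument_before(rewriter, index, instruction)


program = ["push ebp", "mov ebp, esp", "add eax, 1", "ret"]
etch = Rewriter(program)
instrument(etch)
etch.run()
print(counts["instructions"])
```
</PRE>
<P>
Running the sketch prints 4, one count per instruction. The split
matters: instrument_before runs once per instruction while the binary
is being scanned, whereas bump_counter runs each time the instrumented
program executes that instruction.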
<P>
Once the entire executable has been scanned and instrumented,
Etch writes a new version of the executable that can be run. Any
functions referenced in the callback routines, as well as the
Etch runtime library, are included in the new executable. <BR>

<H3>Data Collection </H3>

<P>
The executable written by Etch can be run, and any instrumentation
routines will run as a side effect of running the program. While
the program is running, instrumentation routines can inspect its
state, for example the contents of registers or effective
addresses. All addresses, whether text or data, are relative to
the original binary, so the collection routines do not have to
compensate for the fact that they are part of a modified executable.
<BR>

<H3>Data Processing </H3>

<P>
When an Etched program terminates, its data collection routines
can save information about the executable to disk. Later, post-processing
utilities can examine the data. For example, a predicted execution
time can be determined after the fact based on hypothetical processor,
cache, and memory speeds. At a lower level, detailed information
about a program's performance can be obtained, as shown
below in the graph of instruction cache performance for a collection
of popular Win32 programs. The graph shows the miss penalty
of the first-level instruction cache and a second-level unified
cache for the Perl interpreter, three commercial C++ compilers,
and MS-WORD.<BR>

<P>
<IMG SRC="IMG00001.GIF"><BR>
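
<P>
The predicted execution time mentioned above can be estimated with a
simple additive model. The sketch below is an assumption about how
such a post-processing step might look, not Etch's actual model: the
function name, the parameters and every number in the example call are
invented for illustration.
<PRE>
```python
def predicted_seconds(instructions, l1_misses, l2_misses, clock_hz,
                      cycles_per_instruction, l2_hit_cycles, memory_cycles):
    """Estimate run time from collected counts and hypothetical speeds."""
    cycles = (instructions * cycles_per_instruction  # base execution cost
              + l1_misses * l2_hit_cycles            # each L1 miss goes to L2
              + l2_misses * memory_cycles)           # each L2 miss goes to memory
    return cycles / clock_hz


# Invented counts for a hypothetical 100 MHz machine.
estimate = predicted_seconds(
    instructions=1000000,
    l1_misses=10000,
    l2_misses=1000,
    clock_hz=100000000,
    cycles_per_instruction=1,
    l2_hit_cycles=10,
    memory_cycles=50,
)
print(estimate)
```
</PRE>
<P>
Because the counts are collected once and the speeds are ordinary
parameters, the same data can be replayed against many hypothetical
machines without rerunning the program.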

<H3>Optimization </H3>

<P>
Etch also provides facilities for rewriting an executable in order
to improve its performance. For example, the instrumentation phase,
rather than adding new instructions, can direct Etch to write
the executable out according to a different code layout optimized
for cache and VM behavior. 
<H4>The impact of optimization</H4>

<P>
The graph below shows the reduction in instruction cache misses
and execution time (in cycles) for a collection of popular Win32
programs that have been optimized for code layout using Etch on
a 90 MHz Pentium. Etch was first used to discover the programs'
locality while they executed against a training set, and the programs
were then rewritten to achieve a tighter cache and VM packing. Infrequently
executed basic blocks were moved out of line, and frequently interacting
basic blocks were laid out contiguously in the executable. The
results were measured using inputs different from those used during
training.<BR>

<P>
<IMG SRC="IMG00002.GIF"><BR>
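
<P>
Moving infrequently executed basic blocks out of line can be sketched
as a reordering driven by execution counts from the training run. The
code below shows only that hot/cold split, not the placement of
frequently interacting blocks next to each other, and it is not Etch's
algorithm: the function name, the threshold and the block names are
hypothetical.
<PRE>
```python
def split_hot_cold(blocks, counts, hot_threshold):
    """Return blocks reordered so hot ones are contiguous and cold ones last."""
    hot = [b for b in blocks if counts.get(b, 0) >= hot_threshold]
    hot_set = set(hot)
    cold = [b for b in blocks if b not in hot_set]
    return hot + cold


# Execution counts as they might come from a training run (invented).
blocks = ["entry", "error_path", "loop_body", "cleanup", "loop_test"]
counts = {"entry": 1, "error_path": 0, "loop_body": 5000,
          "cleanup": 1, "loop_test": 5001}

print(split_hot_cold(blocks, counts, hot_threshold=100))
```
</PRE>
<P>
A real rewriter must also patch branch targets after reordering so
that the program behaves exactly as before; the sketch leaves that
step out.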

<H2>The User Interface </H2>

<P>
In addition to a programming interface, Etch also offers a graphical
user interface for performing common instrumentation and optimization
operations. The user interface can drive the measurement process:
it runs Etch on the original binary to produce a new binary, modified
to collect the necessary behavioral data; it executes the modified
binary to produce the data; and it feeds the data to analysis
tools that produce graphs or charts that help to pinpoint problems.
Once a problem has been identified, the user may instruct Etch
to perform a performance-optimization transformation. For example,
Etch may rewrite the original binary to change the layout of data
or code in order to improve cache or virtual memory performance.
  <BR>

<H3>Sample dialog box from the user interface</H3>

<P>
<IMG SRC="IMG00003.GIF"><BR>
<BR>

<H3>Sample results showing distribution of instruction opcodes</H3>

<P>
<IMG SRC="IMG00004.GIF"><BR>

<H2>Requirements </H2>

<P>
Etch runs on Intel 486, Pentium and P6 processors with at least
 24 MB of memory. Etch works on 32-bit (Win32) binaries. It has
been used for programs built by MSVC, Borland, and Intel compilers.
  
<H2>Availability </H2>

<P>
If you are interested in obtaining more information about Etch,
please contact <A href="mailto:etch-info@cs.washington.edu">etch-info@cs.washington.edu</A>.

<H2>Project Members </H2>

<P>
Etch is the result of the efforts of people at Harvard University and
the University of Washington, including Dennis Lee, Ted Romer,
Geoff Voelker, Alec Wolman, Wayne Wong, Brad Chen, Brian Bershad,
and Hank Levy. 
</BODY>

</HTML>
